Towards mixed language speech recognition systems
Authors
Abstract
Multilingual speech recognition involves numerous research challenges, including common phoneme sets, adaptation with limited amounts of training data, and mixed language recognition (common in many countries, such as Switzerland). In this last case, one cannot even assume that the language being spoken is known in advance. This is the context and motivation of the present work. We investigate how current state-of-the-art speech recognition systems can be exploited in multilingual environments, where the language (from an assumed set of five possible languages, in our case) is not known a priori during recognition. We combine monolingual systems and extensively develop and compare different features and acoustic models. On SpeechDat(II) datasets, and in the context of isolated words, we show that it is possible to approach the performance of monolingual systems even if the identity of the spoken language is not known a priori.
Similar resources
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language
Setup of an emotion recognition or emotional speech recognition system is directly related to how emotion changes the speech features. In this research, the influence of emotion on the anger and happiness was evaluated and the results were compared with the neutral speech. So the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...
Towards Natural Language Understanding of Partial Speech Recognition Results in Dialogue Systems
We investigate natural language understanding of partial speech recognition results to equip a dialogue system with incremental language processing capabilities for more realistic human-computer conversations. We show that relatively high accuracy can be achieved in understanding of spontaneous utterances before utterances are completed.
Towards best practice in the development and evaluation of speech recognition components of a spoken language dialog system
Spoken Language Dialog Systems (SLDSs) aim to use natural spoken input for performing an information processing task such as call routing or train ticket reservation (Lamel et al., 1995). The main functionalities of an SLDS are speech recognition, natural language understanding, dialog management, response generation and speech synthesis. This article summarizes key aspects of the current pra...
Towards Acoustic Modeling of Lithuanian Speech
In this paper we present an experimental investigation of using various phone sets for acoustic modeling of Lithuanian speech applied to large vocabulary continuous speech recognition. The paper presents specifics of Lithuanian speech acoustics, including accentuation, diphthongs, softening and assimilation of consonants. The speech recognition experiments use only an acoustic model since effective langua...
Publication date: 2010